Overview

Dataset statistics

Number of variables: 18
Number of observations: 24078
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 MiB
Average record size in memory: 144.0 B
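
The two memory figures above agree with each other if every cell occupies 8 bytes, which is what pandas reports for 64-bit numeric columns and for object columns measured shallowly. The 8-byte figure is an assumption made here for illustration; the report itself does not state it. A quick arithmetic check:

```python
# Figures copied from the dataset statistics above.
n_variables = 18
n_observations = 24078

# Assumption (not stated in the report): 8 bytes per cell.
BYTES_PER_CELL = 8

record_bytes = n_variables * BYTES_PER_CELL          # bytes per row
total_mib = n_observations * record_bytes / 2**20    # bytes -> MiB

print(f"Average record size: {record_bytes} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under that assumption the computed values match the reported 144.0 B per record and 3.3 MiB in total.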

Variable types

Numeric: 10
Categorical: 8

Alerts

Attempted is highly overall correlated with Full [High correlation]
NumberOfMajors is highly overall correlated with NumberOfUniqueMajors [High correlation]
NumberOfUniqueMajors is highly overall correlated with NumberOfMajors [High correlation]
Full is highly overall correlated with Attempted [High correlation]
Ethnicity is highly overall correlated with IsHispanic [High correlation]
IsHispanic is highly overall correlated with Ethnicity [High correlation]
Dev is highly imbalanced (75.6%) [Imbalance]
Ethnicity is highly imbalanced (54.7%) [Imbalance]
IsHispanic is highly imbalanced (60.5%) [Imbalance]
Internet has 11642 (48.4%) zeros [Zeros]
PercentageOfRepeats has 20092 (83.4%) zeros [Zeros]
PercentageOfHistDrop has 16928 (70.3%) zeros [Zeros]
CumGPALast has 370 (1.5%) zeros [Zeros]
PercentageOfAbsence has 8873 (36.9%) zeros [Zeros]
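
The imbalance percentages in the alerts can be recomputed from the category counts listed in the variable sections. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, divided by the maximum entropy for that number of classes. The report does not state the formula, but this definition reproduces the 75.6% shown for Dev and the 60.5% shown for IsHispanic. `imbalance_score` is an illustrative helper name, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalised Shannon entropy (0 = balanced, 1 = single class).

    Assumes at least two categories.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1 - entropy / math.log(len(counts))

# Category counts copied from the Dev and IsHispanic sections of this report.
dev_counts = [22004, 1702, 340, 32]
is_hispanic_counts = [20514, 3494, 70]

print(f"Dev: {imbalance_score(dev_counts):.1%}")
print(f"IsHispanic: {imbalance_score(is_hispanic_counts):.1%}")

# The "zeros" alerts are plain shares of the 24078 observations.
internet_zero_share = 11642 / 24078
print(f"Internet zeros: {internet_zero_share:.1%}")
```

A score near 1 means almost all rows fall into one category, which is why Dev (91.4% of rows in category 0) triggers the alert.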

Reproduction

Analysis started: 2023-05-18 19:02:21.353741
Analysis finished: 2023-05-18 19:02:57.358801
Duration: 36.01 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
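
A report of this kind can be regenerated with the pandas-profiling version listed above. The sketch below is an assumption about the workflow, not the original script: the input path, report title, and output file name are hypothetical placeholders, and the config.json mentioned above is not applied.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are placeholders)."""
    # Imported lazily so that defining this function does not require
    # pandas or pandas-profiling to be installed.
    import pandas as pd
    from pandas_profiling import ProfileReport  # import path for pandas-profiling 3.x

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return report

# Example call (file names are hypothetical):
# build_report("students.csv", "student_profile.html")
```

Identifier columns such as StudentID are profiled as numbers unless they are cast to strings first, which is why the report shows a mean and variance for it.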

Variables

StudentID
Real number (ℝ)

Distinct: 11440
Distinct (%): 47.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.375135 × 10^9
Minimum: 1.0100118 × 10^8
Maximum: 1.0112752 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 1.0100118 × 10^8
5-th percentile: 1.9500906 × 10^8
Q1: 5.5600459 × 10^8
median: 1.0111105 × 10^10
Q3: 1.0111117 × 10^10
95-th percentile: 1.011274 × 10^10
Maximum: 1.0112752 × 10^10
Range: 1.0011751 × 10^10
Interquartile range (IQR): 9.5551128 × 10^9

Descriptive statistics

Standard deviation: 4.7022347 × 10^9
Coefficient of variation (CV): 0.73758983
Kurtosis: -1.780313
Mean: 6.375135 × 10^9
Median Absolute Deviation (MAD): 1630342
Skewness: -0.46538127
Sum: 1.535005 × 10^14
Variance: 2.2111012 × 10^19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1.011110424 × 10^10 | 11 | < 0.1%
1.011110321 × 10^10 | 10 | < 0.1%
673006251 | 10 | < 0.1%
396009807 | 10 | < 0.1%
1.011110503 × 10^10 | 9 | < 0.1%
414008883 | 9 | < 0.1%
1.011110491 × 10^10 | 9 | < 0.1%
712001867 | 9 | < 0.1%
716002765 | 9 | < 0.1%
252003703 | 9 | < 0.1%
Other values (11430) | 23983 | 99.6%

Smallest values
Value | Count | Frequency (%)
101001184 | 2 | < 0.1%
101003030 | 3 | < 0.1%
101004016 | 1 | < 0.1%
101009897 | 3 | < 0.1%
102001082 | 1 | < 0.1%
102002426 | 2 | < 0.1%
102002757 | 4 | < 0.1%
102004142 | 2 | < 0.1%
102007458 | 1 | < 0.1%
102008114 | 5 | < 0.1%

Largest values
Value | Count | Frequency (%)
1.011275214 × 10^10 | 1 | < 0.1%
1.011275214 × 10^10 | 1 | < 0.1%
1.01127521 × 10^10 | 1 | < 0.1%
1.011275208 × 10^10 | 1 | < 0.1%
1.011275207 × 10^10 | 1 | < 0.1%
1.011275207 × 10^10 | 1 | < 0.1%
1.011275203 × 10^10 | 1 | < 0.1%
1.011275201 × 10^10 | 1 | < 0.1%
1.011275201 × 10^10 | 1 | < 0.1%
1.011275201 × 10^10 | 1 | < 0.1%

Attempted
Real number (ℝ)

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.2604868
Minimum: 1
Maximum: 26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 6
median: 9
Q3: 13
95-th percentile: 16
Maximum: 26
Range: 25
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.1064977
Coefficient of variation (CV): 0.44344296
Kurtosis: -0.42455466
Mean: 9.2604868
Median Absolute Deviation (MAD): 3
Skewness: 0.35569483
Sum: 222974
Variance: 16.863323
Monotonicity: Not monotonic
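
Several of the descriptive statistics above are derived from one another, which gives a quick sanity check on the table. The snippet below recomputes the mean, variance, coefficient of variation, range, and IQR for Attempted using only figures copied from this section and the observation count from the overview.

```python
import math

# Figures copied from the Attempted section and the dataset overview.
n = 24078
total = 222974
std = 4.1064977
q1, q3 = 6, 13
minimum, maximum = 1, 26

mean = total / n                  # Mean = Sum / number of observations
variance = std ** 2               # Variance = (standard deviation)^2
cv = std / mean                   # Coefficient of variation = std / mean
value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

# Each derived value agrees with the reported one to the printed precision.
assert math.isclose(mean, 9.2604868, rel_tol=1e-6)
assert math.isclose(variance, 16.863323, rel_tol=1e-6)
assert math.isclose(cv, 0.44344296, rel_tol=1e-6)
assert (value_range, iqr) == (25, 7)
```

The same relations hold for the other numeric variables in this report, so the check can be repeated by swapping in their figures.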
Histogram with fixed size bins (bins=26)

Common values
Value | Count | Frequency (%)
6 | 3608 | 15.0%
8 | 2592 | 10.8%
13 | 2468 | 10.3%
3 | 2370 | 9.8%
9 | 2360 | 9.8%
12 | 2112 | 8.8%
10 | 1527 | 6.3%
7 | 1455 | 6.0%
4 | 1205 | 5.0%
14 | 914 | 3.8%
Other values (16) | 3467 | 14.4%

Smallest values
Value | Count | Frequency (%)
1 | 18 | 0.1%
2 | 42 | 0.2%
3 | 2370 | 9.8%
4 | 1205 | 5.0%
5 | 101 | 0.4%
6 | 3608 | 15.0%
7 | 1455 | 6.0%
8 | 2592 | 10.8%
9 | 2360 | 9.8%
10 | 1527 | 6.3%

Largest values
Value | Count | Frequency (%)
26 | 1 | < 0.1%
25 | 7 | < 0.1%
24 | 4 | < 0.1%
23 | 21 | 0.1%
22 | 23 | 0.1%
21 | 58 | 0.2%
20 | 53 | 0.2%
19 | 314 | 1.3%
18 | 166 | 0.7%
17 | 251 | 1.0%

Full
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
0: 17050
1: 7028

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 24078
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 17050 | 70.8%
1 | 7028 | 29.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 17050 | 70.8%
1 | 7028 | 29.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 17050 | 70.8%
1 | 7028 | 29.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 24078 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 17050 | 70.8%
1 | 7028 | 29.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24078 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 17050 | 70.8%
1 | 7028 | 29.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24078 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 17050 | 70.8%
1 | 7028 | 29.2%

Age
Real number (ℝ)

Distinct: 58
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.934255
Minimum: 14
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 14
5-th percentile: 17
Q1: 18
median: 20
Q3: 26
95-th percentile: 43
Maximum: 75
Range: 61
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 8.676219
Coefficient of variation (CV): 0.36250215
Kurtosis: 3.5002275
Mean: 23.934255
Median Absolute Deviation (MAD): 2
Skewness: 1.8893765
Sum: 576289
Variance: 75.276776
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
18 | 3608 | 15.0%
19 | 3197 | 13.3%
20 | 2505 | 10.4%
17 | 2439 | 10.1%
21 | 1801 | 7.5%
22 | 1193 | 5.0%
23 | 912 | 3.8%
24 | 717 | 3.0%
25 | 616 | 2.6%
26 | 540 | 2.2%
Other values (48) | 6550 | 27.2%

Smallest values
Value | Count | Frequency (%)
14 | 1 | < 0.1%
15 | 135 | 0.6%
16 | 483 | 2.0%
17 | 2439 | 10.1%
18 | 3608 | 15.0%
19 | 3197 | 13.3%
20 | 2505 | 10.4%
21 | 1801 | 7.5%
22 | 1193 | 5.0%
23 | 912 | 3.8%

Largest values
Value | Count | Frequency (%)
75 | 1 | < 0.1%
70 | 2 | < 0.1%
69 | 1 | < 0.1%
68 | 4 | < 0.1%
67 | 7 | < 0.1%
66 | 3 | < 0.1%
65 | 10 | < 0.1%
64 | 7 | < 0.1%
63 | 8 | < 0.1%
62 | 21 | 0.1%

Dev
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
0: 22004
1: 1702
2: 340
3: 32

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 24078
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 22004 | 91.4%
1 | 1702 | 7.1%
2 | 340 | 1.4%
3 | 32 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 22004 | 91.4%
1 | 1702 | 7.1%
2 | 340 | 1.4%
3 | 32 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 22004 | 91.4%
1 | 1702 | 7.1%
2 | 340 | 1.4%
3 | 32 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 24078 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 22004 | 91.4%
1 | 1702 | 7.1%
2 | 340 | 1.4%
3 | 32 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24078 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 22004 | 91.4%
1 | 1702 | 7.1%
2 | 340 | 1.4%
3 | 32 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24078 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 22004 | 91.4%
1 | 1702 | 7.1%
2 | 340 | 1.4%
3 | 32 | 0.1%

Internet
Real number (ℝ)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1032062
Minimum: 0
Maximum: 9
Zeros: 11642
Zeros (%): 48.4%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.407834
Coefficient of variation (CV): 1.2761295
Kurtosis: 1.4645788
Mean: 1.1032062
Median Absolute Deviation (MAD): 1
Skewness: 1.3628079
Sum: 26563
Variance: 1.9819966
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values
Value | Count | Frequency (%)
0 | 11642 | 48.4%
1 | 5074 | 21.1%
2 | 3528 | 14.7%
3 | 1941 | 8.1%
4 | 1147 | 4.8%
5 | 522 | 2.2%
6 | 164 | 0.7%
7 | 54 | 0.2%
8 | 4 | < 0.1%
9 | 2 | < 0.1%

Smallest values
Value | Count | Frequency (%)
0 | 11642 | 48.4%
1 | 5074 | 21.1%
2 | 3528 | 14.7%
3 | 1941 | 8.1%
4 | 1147 | 4.8%
5 | 522 | 2.2%
6 | 164 | 0.7%
7 | 54 | 0.2%
8 | 4 | < 0.1%
9 | 2 | < 0.1%

Largest values
Value | Count | Frequency (%)
9 | 2 | < 0.1%
8 | 4 | < 0.1%
7 | 54 | 0.2%
6 | 164 | 0.7%
5 | 522 | 2.2%
4 | 1147 | 4.8%
3 | 1941 | 8.1%
2 | 3528 | 14.7%
1 | 5074 | 21.1%
0 | 11642 | 48.4%

PercentageOfRepeats
Real number (ℝ)

Distinct: 37
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.091595636
Minimum: 0
Maximum: 1
Zeros: 20092
Zeros (%): 83.4%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.71428571
Maximum: 1
Range: 1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2391691
Coefficient of variation (CV): 2.6111408
Kurtosis: 6.7928142
Mean: 0.091595636
Median Absolute Deviation (MAD): 0
Skewness: 2.776851
Sum: 2205.4397
Variance: 0.057201859
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=37)

Common values
Value | Count | Frequency (%)
0 | 20092 | 83.4%
1 | 951 | 3.9%
0.5 | 547 | 2.3%
0.25 | 426 | 1.8%
0.3333333333 | 415 | 1.7%
0.2 | 371 | 1.5%
0.6666666667 | 293 | 1.2%
0.4 | 261 | 1.1%
0.1666666667 | 154 | 0.6%
0.75 | 125 | 0.5%
Other values (27) | 443 | 1.8%

Smallest values
Value | Count | Frequency (%)
0 | 20092 | 83.4%
0.09090909091 | 1 | < 0.1%
0.1 | 1 | < 0.1%
0.1111111111 | 1 | < 0.1%
0.125 | 14 | 0.1%
0.1428571429 | 42 | 0.2%
0.1666666667 | 154 | 0.6%
0.2 | 371 | 1.5%
0.2222222222 | 8 | < 0.1%
0.25 | 426 | 1.8%

Largest values
Value | Count | Frequency (%)
1 | 951 | 3.9%
0.9 | 1 | < 0.1%
0.8888888889 | 3 | < 0.1%
0.875 | 7 | < 0.1%
0.8571428571 | 15 | 0.1%
0.8333333333 | 23 | 0.1%
0.8 | 66 | 0.3%
0.7777777778 | 2 | < 0.1%
0.75 | 125 | 0.5%
0.7272727273 | 2 | < 0.1%

PercentageOfHistDrop
Real number (ℝ)

Distinct: 302
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.055235649
Minimum: 0
Maximum: 1
Zeros: 16928
Zeros (%): 70.3%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0.058823529
95-th percentile: 0.28571429
Maximum: 1
Range: 1
Interquartile range (IQR): 0.058823529

Descriptive statistics

Standard deviation: 0.12639639
Coefficient of variation (CV): 2.2883119
Kurtosis: 20.678625
Mean: 0.055235649
Median Absolute Deviation (MAD): 0
Skewness: 3.9113459
Sum: 1329.964
Variance: 0.015976048
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 16928 | 70.3%
0.2 | 358 | 1.5%
0.25 | 325 | 1.3%
0.1666666667 | 269 | 1.1%
0.3333333333 | 261 | 1.1%
0.1428571429 | 256 | 1.1%
0.1111111111 | 243 | 1.0%
0.125 | 239 | 1.0%
0.5 | 223 | 0.9%
0.1 | 203 | 0.8%
Other values (292) | 4773 | 19.8%

Smallest values
Value | Count | Frequency (%)
0 | 16928 | 70.3%
0.01754385965 | 2 | < 0.1%
0.01886792453 | 1 | < 0.1%
0.01923076923 | 1 | < 0.1%
0.01960784314 | 1 | < 0.1%
0.02 | 3 | < 0.1%
0.02040816327 | 3 | < 0.1%
0.02083333333 | 3 | < 0.1%
0.02127659574 | 3 | < 0.1%
0.02173913043 | 3 | < 0.1%

Largest values
Value | Count | Frequency (%)
1 | 144 | 0.6%
0.8333333333 | 2 | < 0.1%
0.8 | 4 | < 0.1%
0.75 | 10 | < 0.1%
0.7142857143 | 2 | < 0.1%
0.7 | 1 | < 0.1%
0.6666666667 | 33 | 0.1%
0.6363636364 | 1 | < 0.1%
0.625 | 3 | < 0.1%
0.6153846154 | 2 | < 0.1%

TermCode
Categorical

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
B17C: 3221
B18C: 3138
B17Q: 2551
B19C: 2079
B23C: 1962
Other values (8): 11127

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 96312
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B17C
2nd row: B18Q
3rd row: B17C
4th row: B17C
5th row: B19Q

Common Values

Value | Count | Frequency (%)
B17C | 3221 | 13.4%
B18C | 3138 | 13.0%
B17Q | 2551 | 10.6%
B19C | 2079 | 8.6%
B23C | 1962 | 8.1%
B21C | 1843 | 7.7%
B22C | 1704 | 7.1%
B19Q | 1611 | 6.7%
B18Q | 1584 | 6.6%
B22Q | 1412 | 5.9%
Other values (3) | 2973 | 12.3%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
b17c | 3221 | 13.4%
b18c | 3138 | 13.0%
b17q | 2551 | 10.6%
b19c | 2079 | 8.6%
b23c | 1962 | 8.1%
b21c | 1843 | 7.7%
b22c | 1704 | 7.1%
b19q | 1611 | 6.7%
b18q | 1584 | 6.6%
b22q | 1412 | 5.9%
Other values (3) | 2973 | 12.3%

Most occurring characters

Value | Count | Frequency (%)
B | 24078 | 25.0%
1 | 17412 | 18.1%
C | 14141 | 14.7%
2 | 13010 | 13.5%
Q | 9937 | 10.3%
7 | 5772 | 6.0%
8 | 4722 | 4.9%
9 | 3690 | 3.8%
3 | 1962 | 2.0%
0 | 1588 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 48156 | 50.0%
Decimal Number | 48156 | 50.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 17412 | 36.2%
2 | 13010 | 27.0%
7 | 5772 | 12.0%
8 | 4722 | 9.8%
9 | 3690 | 7.7%
3 | 1962 | 4.1%
0 | 1588 | 3.3%
Uppercase Letter
Value | Count | Frequency (%)
B | 24078 | 50.0%
C | 14141 | 29.4%
Q | 9937 | 20.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 48156 | 50.0%
Common | 48156 | 50.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 17412 | 36.2%
2 | 13010 | 27.0%
7 | 5772 | 12.0%
8 | 4722 | 9.8%
9 | 3690 | 7.7%
3 | 1962 | 4.1%
0 | 1588 | 3.3%
Latin
Value | Count | Frequency (%)
B | 24078 | 50.0%
C | 14141 | 29.4%
Q | 9937 | 20.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 96312 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
B | 24078 | 25.0%
1 | 17412 | 18.1%
C | 14141 | 14.7%
2 | 13010 | 13.5%
Q | 9937 | 10.3%
7 | 5772 | 6.0%
8 | 4722 | 4.9%
9 | 3690 | 3.8%
3 | 1962 | 2.0%
0 | 1588 | 1.6%

[Variable name missing from this export]
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
0: 20861
1: 3217

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 24078
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 20861 | 86.6%
1 | 3217 | 13.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 20861 | 86.6%
1 | 3217 | 13.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 20861 | 86.6%
1 | 3217 | 13.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 24078 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 20861 | 86.6%
1 | 3217 | 13.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24078 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 20861 | 86.6%
1 | 3217 | 13.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24078 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 20861 | 86.6%
1 | 3217 | 13.4%

NumberOfMajors
Real number (ℝ)

Distinct: 27
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.5774981
Minimum: 1
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
median: 5
Q3: 7
95-th percentile: 12
Maximum: 36
Range: 35
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.3878038
Coefficient of variation (CV): 0.60740563
Kurtosis: 2.4495501
Mean: 5.5774981
Median Absolute Deviation (MAD): 2
Skewness: 1.3895451
Sum: 134295
Variance: 11.477214
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)

Common values
Value | Count | Frequency (%)
4 | 4060 | 16.9%
2 | 3877 | 16.1%
3 | 3439 | 14.3%
5 | 2964 | 12.3%
6 | 2169 | 9.0%
7 | 1805 | 7.5%
8 | 1454 | 6.0%
9 | 1002 | 4.2%
10 | 908 | 3.8%
11 | 653 | 2.7%
Other values (17) | 1747 | 7.3%

Smallest values
Value | Count | Frequency (%)
1 | 171 | 0.7%
2 | 3877 | 16.1%
3 | 3439 | 14.3%
4 | 4060 | 16.9%
5 | 2964 | 12.3%
6 | 2169 | 9.0%
7 | 1805 | 7.5%
8 | 1454 | 6.0%
9 | 1002 | 4.2%
10 | 908 | 3.8%

Largest values
Value | Count | Frequency (%)
36 | 1 | < 0.1%
26 | 1 | < 0.1%
25 | 3 | < 0.1%
24 | 4 | < 0.1%
23 | 4 | < 0.1%
22 | 14 | 0.1%
21 | 17 | 0.1%
20 | 26 | 0.1%
19 | 44 | 0.2%
18 | 50 | 0.2%

NumberOfUniqueMajors
Real number (ℝ)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5147853
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 3
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7413366
Coefficient of variation (CV): 0.48940045
Kurtosis: 2.1854944
Mean: 1.5147853
Median Absolute Deviation (MAD): 0
Skewness: 1.4849018
Sum: 36473
Variance: 0.54957995
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values
Value | Count | Frequency (%)
1 | 14667 | 60.9%
2 | 6977 | 29.0%
3 | 1953 | 8.1%
4 | 418 | 1.7%
5 | 57 | 0.2%
6 | 6 | < 0.1%

Smallest values
Value | Count | Frequency (%)
1 | 14667 | 60.9%
2 | 6977 | 29.0%
3 | 1953 | 8.1%
4 | 418 | 1.7%
5 | 57 | 0.2%
6 | 6 | < 0.1%

Largest values
Value | Count | Frequency (%)
6 | 6 | < 0.1%
5 | 57 | 0.2%
4 | 418 | 1.7%
3 | 1953 | 8.1%
2 | 6977 | 29.0%
1 | 14667 | 60.9%

CumGPALast
Real number (ℝ)

Distinct: 3939
Distinct (%): 16.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0147963
Minimum: 0
Maximum: 4
Zeros: 370
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.5
Q1: 2.563825
median: 3.125
Q3: 3.6471
95-th percentile: 4
Maximum: 4
Range: 4
Interquartile range (IQR): 1.083275

Descriptive statistics

Standard deviation: 0.83323746
Coefficient of variation (CV): 0.27638267
Kurtosis: 1.8044744
Mean: 3.0147963
Median Absolute Deviation (MAD): 0.5417
Skewness: -1.1971195
Sum: 72590.267
Variance: 0.69428467
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
4 | 3393 | 14.1%
3 | 1989 | 8.3%
3.5 | 943 | 3.9%
2 | 869 | 3.6%
2.5 | 555 | 2.3%
0 | 370 | 1.5%
3.25 | 305 | 1.3%
3.75 | 288 | 1.2%
3.3333 | 246 | 1.0%
3.6667 | 246 | 1.0%
Other values (3929) | 14874 | 61.8%

Smallest values
Value | Count | Frequency (%)
0 | 370 | 1.5%
0.0909 | 1 | < 0.1%
0.1364 | 1 | < 0.1%
0.1429 | 3 | < 0.1%
0.15 | 1 | < 0.1%
0.1667 | 1 | < 0.1%
0.1765 | 1 | < 0.1%
0.1818 | 2 | < 0.1%
0.1875 | 1 | < 0.1%
0.2 | 2 | < 0.1%

Largest values
Value | Count | Frequency (%)
4 | 3393 | 14.1%
3.9897 | 1 | < 0.1%
3.969 | 1 | < 0.1%
3.9634 | 1 | < 0.1%
3.962 | 1 | < 0.1%
3.961 | 1 | < 0.1%
3.96 | 1 | < 0.1%
3.9592 | 1 | < 0.1%
3.9577 | 2 | < 0.1%
3.9565 | 1 | < 0.1%

PercentageOfAbsence
Real number (ℝ)

Distinct: 1373
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.092939393
Minimum: 0
Maximum: 1
Zeros: 8873
Zeros (%): 36.9%
Negative: 0
Negative (%): 0.0%
Memory size: 188.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.04
Q3: 0.11764706
95-th percentile: 0.39655172
Maximum: 1
Range: 1
Interquartile range (IQR): 0.11764706

Descriptive statistics

Standard deviation: 0.14427427
Coefficient of variation (CV): 1.5523479
Kurtosis: 8.33012
Mean: 0.092939393
Median Absolute Deviation (MAD): 0.04
Skewness: 2.6523385
Sum: 2237.7947
Variance: 0.020815064
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 8873 | 36.9%
0.1111111111 | 224 | 0.9%
0.05882352941 | 224 | 0.9%
0.125 | 209 | 0.9%
0.05555555556 | 208 | 0.9%
0.06666666667 | 196 | 0.8%
0.0625 | 193 | 0.8%
0.09090909091 | 191 | 0.8%
0.1 | 190 | 0.8%
0.1666666667 | 182 | 0.8%
Other values (1363) | 13388 | 55.6%

Smallest values
Value | Count | Frequency (%)
0 | 8873 | 36.9%
0.007042253521 | 1 | < 0.1%
0.007575757576 | 1 | < 0.1%
0.007751937984 | 1 | < 0.1%
0.008064516129 | 1 | < 0.1%
0.008196721311 | 1 | < 0.1%
0.008474576271 | 1 | < 0.1%
0.008695652174 | 2 | < 0.1%
0.008928571429 | 2 | < 0.1%
0.009259259259 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
1 | 5 | < 0.1%
0.9756097561 | 1 | < 0.1%
0.9736842105 | 1 | < 0.1%
0.9696969697 | 1 | < 0.1%
0.96875 | 1 | < 0.1%
0.9666666667 | 1 | < 0.1%
0.9622641509 | 1 | < 0.1%
0.9615384615 | 1 | < 0.1%
0.96 | 1 | < 0.1%
0.95 | 3 | < 0.1%

Ethnicity
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
White, Non-Hispanic: 16760
Hispanic: 3467
Black, Non-Hispanic: 1847
American Indian or Alaskan Native: 1107
Asian: 472
Other values (5): 425

Length

Max length: 35
Median length: 19
Mean length: 17.754631
Min length: 5

Characters and Unicode

Total characters: 427496
Distinct characters: 32
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: White, Non-Hispanic
2nd row: White, Non-Hispanic
3rd row: White, Non-Hispanic
4th row: Hispanic
5th row: Hispanic

Common Values

Value | Count | Frequency (%)
White, Non-Hispanic | 16760 | 69.6%
Hispanic | 3467 | 14.4%
Black, Non-Hispanic | 1847 | 7.7%
American Indian or Alaskan Native | 1107 | 4.6%
Asian | 472 | 2.0%
Unknown or Not Reported | 134 | 0.6%
International | 133 | 0.6%
unknown | 105 | 0.4%
Native Hawaiian or Pacific Islander | 48 | 0.2%
Not Hispanic or Latino | 5 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
non-hispanic | 18607 | 39.0%
white | 16760 | 35.1%
hispanic | 3472 | 7.3%
black | 1847 | 3.9%
or | 1294 | 2.7%
native | 1155 | 2.4%
american | 1107 | 2.3%
indian | 1107 | 2.3%
alaskan | 1107 | 2.3%
asian | 472 | 1.0%
Other values (8) | 794 | 1.7%

Most occurring characters

Value | Count | Frequency (%)
i | 65089 | 15.2%
n | 46803 | 10.9%
a | 30492 | 7.1%
c | 25129 | 5.9%
s | 23706 | 5.5%
(space) | 23644 | 5.5%
p | 22213 | 5.2%
H | 22127 | 5.2%
o | 20551 | 4.8%
N | 19901 | 4.7%
Other values (22) | 127841 | 29.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 301708 | 70.6%
Uppercase Letter | 64930 | 15.2%
Space Separator | 23644 | 5.5%
Dash Punctuation | 18607 | 4.4%
Other Punctuation | 18607 | 4.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 65089 | 21.6%
n | 46803 | 15.5%
a | 30492 | 10.1%
c | 25129 | 8.3%
s | 23706 | 7.9%
p | 22213 | 7.4%
o | 20551 | 6.8%
e | 19471 | 6.5%
t | 18459 | 6.1%
h | 16760 | 5.6%
Other values (9) | 13035 | 4.3%
Uppercase Letter
Value | Count | Frequency (%)
H | 22127 | 34.1%
N | 19901 | 30.6%
W | 16760 | 25.8%
A | 2686 | 4.1%
B | 1847 | 2.8%
I | 1288 | 2.0%
U | 134 | 0.2%
R | 134 | 0.2%
P | 48 | 0.1%
L | 5 | < 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 23644 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 18607 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 18607 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 366638 | 85.8%
Common | 60858 | 14.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 65089 | 17.8%
n | 46803 | 12.8%
a | 30492 | 8.3%
c | 25129 | 6.9%
s | 23706 | 6.5%
p | 22213 | 6.1%
H | 22127 | 6.0%
o | 20551 | 5.6%
N | 19901 | 5.4%
e | 19471 | 5.3%
Other values (19) | 71156 | 19.4%
Common
Value | Count | Frequency (%)
(space) | 23644 | 38.9%
- | 18607 | 30.6%
, | 18607 | 30.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 427496 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 65089 | 15.2%
n | 46803 | 10.9%
a | 30492 | 7.1%
c | 25129 | 5.9%
s | 23706 | 5.5%
(space) | 23644 | 5.5%
p | 22213 | 5.2%
H | 22127 | 5.2%
o | 20551 | 4.8%
N | 19901 | 4.7%
Other values (22) | 127841 | 29.9%

Gender
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
Female: 14889
Male: 9173
unknown: 16

Length

Max length: 7
Median length: 6
Mean length: 5.2387241
Min length: 4

Characters and Unicode

Total characters: 126138
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Female
4th row: Female
5th row: Female

Common Values

Value | Count | Frequency (%)
Female | 14889 | 61.8%
Male | 9173 | 38.1%
unknown | 16 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
female | 14889 | 61.8%
male | 9173 | 38.1%
unknown | 16 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 38951 | 30.9%
a | 24062 | 19.1%
l | 24062 | 19.1%
F | 14889 | 11.8%
m | 14889 | 11.8%
M | 9173 | 7.3%
n | 48 | < 0.1%
u | 16 | < 0.1%
k | 16 | < 0.1%
o | 16 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 102076 | 80.9%
Uppercase Letter | 24062 | 19.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 38951 | 38.2%
a | 24062 | 23.6%
l | 24062 | 23.6%
m | 14889 | 14.6%
n | 48 | < 0.1%
u | 16 | < 0.1%
k | 16 | < 0.1%
o | 16 | < 0.1%
w | 16 | < 0.1%
Uppercase Letter
Value | Count | Frequency (%)
F | 14889 | 61.9%
M | 9173 | 38.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 126138 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 38951 | 30.9%
a | 24062 | 19.1%
l | 24062 | 19.1%
F | 14889 | 11.8%
m | 14889 | 11.8%
M | 9173 | 7.3%
n | 48 | < 0.1%
u | 16 | < 0.1%
k | 16 | < 0.1%
o | 16 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 126138 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 38951 | 30.9%
a | 24062 | 19.1%
l | 24062 | 19.1%
F | 14889 | 11.8%
m | 14889 | 11.8%
M | 9173 | 7.3%
n | 48 | < 0.1%
u | 16 | < 0.1%
k | 16 | < 0.1%
o | 16 | < 0.1%

IsHispanic
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
No: 20514
Yes: 3494
unknown: 70

Length

Max length: 7
Median length: 2
Mean length: 2.1596478
Min length: 2

Characters and Unicode

Total characters: 52000
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: Yes
5th row: Yes

Common Values

Value | Count | Frequency (%)
No | 20514 | 85.2%
Yes | 3494 | 14.5%
unknown | 70 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
no | 20514 | 85.2%
yes | 3494 | 14.5%
unknown | 70 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
o | 20584 | 39.6%
N | 20514 | 39.5%
Y | 3494 | 6.7%
e | 3494 | 6.7%
s | 3494 | 6.7%
n | 210 | 0.4%
u | 70 | 0.1%
k | 70 | 0.1%
w | 70 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 27992 | 53.8%
Uppercase Letter | 24008 | 46.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 20584 | 73.5%
e | 3494 | 12.5%
s | 3494 | 12.5%
n | 210 | 0.8%
u | 70 | 0.3%
k | 70 | 0.3%
w | 70 | 0.3%
Uppercase Letter
Value | Count | Frequency (%)
N | 20514 | 85.4%
Y | 3494 | 14.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 52000 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 20584 | 39.6%
N | 20514 | 39.5%
Y | 3494 | 6.7%
e | 3494 | 6.7%
s | 3494 | 6.7%
n | 210 | 0.4%
u | 70 | 0.1%
k | 70 | 0.1%
w | 70 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 52000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 20584 | 39.6%
N | 20514 | 39.5%
Y | 3494 | 6.7%
e | 3494 | 6.7%
s | 3494 | 6.7%
n | 210 | 0.4%
u | 70 | 0.1%
k | 70 | 0.1%
w | 70 | 0.1%

Target
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 188.2 KiB
0: 21163
1: 2915

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 24078
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 21163 | 87.9%
1 | 2915 | 12.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 21163 | 87.9%
1 | 2915 | 12.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 21163 | 87.9%
1 | 2915 | 12.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 24078 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 21163 | 87.9%
1 | 2915 | 12.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24078 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 21163 | 87.9%
1 | 2915 | 12.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24078 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 21163 | 87.9%
1 | 2915 | 12.1%

Interactions

[Interaction plots: 100 Matplotlib v3.6.2 images, consistent with a 10 × 10 grid over the 10 numeric variables; the images are not reproduced in this text version.]

Correlations

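Variable | StudentID | Attempted | Age | Internet | PercentageOfRepeats | PercentageOfHistDrop | NumberOfMajors | NumberOfUniqueMajors | CumGPALast | PercentageOfAbsence | Full | Dev | TermCode | MajorChangedFromLast | Ethnicity | Gender | IsHispanic | Target
StudentID | 1.000 | -0.133 | -0.472 | 0.062 | -0.171 | -0.280 | -0.263 | -0.170 | 0.165 | -0.054 | 0.050 | 0.007 | 0.447 | 0.113 | 0.066 | 0.066 | 0.042 | 0.017
Attempted | -0.133 | 1.000 | 0.265 | 0.377 | 0.111 | 0.099 | 0.180 | 0.221 | -0.078 | 0.081 | 0.901 | 0.125 | 0.065 | 0.159 | 0.049 | 0.054 | 0.026 | 0.161
Age | -0.472 | 0.265 | 1.000 | -0.083 | 0.204 | 0.292 | 0.411 | 0.381 | -0.166 | 0.027 | 0.083 | 0.021 | 0.042 | 0.126 | 0.079 | 0.083 | 0.085 | 0.016
Internet | 0.062 | 0.377 | -0.083 | 1.000 | 0.081 | 0.037 | 0.017 | -0.011 | 0.006 | 0.051 | 0.372 | 0.059 | 0.050 | 0.034 | 0.022 | 0.027 | 0.000 | 0.155
PercentageOfRepeats | -0.171 | 0.111 | 0.204 | 0.081 | 1.000 | 0.338 | 0.141 | 0.076 | -0.341 | 0.260 | 0.216 | 0.159 | 0.044 | 0.057 | 0.034 | 0.017 | 0.032 | 0.156
PercentageOfHistDrop | -0.280 | 0.099 | 0.292 | 0.037 | 0.338 | 1.000 | 0.291 | 0.219 | -0.286 | 0.173 | 0.029 | 0.090 | 0.032 | 0.084 | 0.015 | 0.010 | 0.006 | 0.122
NumberOfMajors | -0.263 | 0.180 | 0.411 | 0.017 | 0.141 | 0.291 | 1.000 | 0.515 | -0.049 | -0.045 | 0.104 | 0.048 | 0.102 | 0.108 | 0.023 | 0.072 | 0.040 | 0.046
NumberOfUniqueMajors | -0.170 | 0.221 | 0.381 | -0.011 | 0.076 | 0.219 | 0.515 | 1.000 | -0.104 | -0.025 | 0.140 | 0.029 | 0.124 | 0.493 | 0.030 | 0.035 | 0.029 | 0.033
CumGPALast | 0.165 | -0.078 | -0.166 | 0.006 | -0.341 | -0.286 | -0.049 | -0.104 | 1.000 | -0.342 | 0.062 | 0.196 | 0.037 | 0.092 | 0.056 | 0.065 | 0.045 | 0.163
PercentageOfAbsence | -0.054 | 0.081 | 0.027 | 0.051 | 0.260 | 0.173 | -0.045 | -0.025 | -0.342 | 1.000 | 0.049 | 0.158 | 0.033 | 0.043 | 0.026 | 0.049 | 0.022 | 0.272
Full | 0.050 | 0.901 | 0.083 | 0.372 | 0.216 | 0.029 | 0.104 | 0.140 | 0.062 | 0.049 | 1.000 | 0.112 | 0.070 | 0.108 | 0.095 | 0.025 | 0.019 | 0.134
Dev | 0.007 | 0.125 | 0.021 | 0.059 | 0.159 | 0.090 | 0.048 | 0.029 | 0.196 | 0.158 | 0.112 | 1.000 | 0.059 | 0.077 | 0.062 | 0.011 | 0.020 | 0.132
TermCode | 0.447 | 0.065 | 0.042 | 0.050 | 0.044 | 0.032 | 0.102 | 0.124 | 0.037 | 0.033 | 0.070 | 0.059 | 1.000 | 0.122 | 0.036 | 0.049 | 0.051 | 0.085
MajorChangedFromLast | 0.113 | 0.159 | 0.126 | 0.034 | 0.057 | 0.084 | 0.108 | 0.493 | 0.092 | 0.043 | 0.108 | 0.077 | 0.122 | 1.000 | 0.011 | 0.000 | 0.004 | 0.041
Ethnicity | 0.066 | 0.049 | 0.079 | 0.022 | 0.034 | 0.015 | 0.023 | 0.030 | 0.056 | 0.026 | 0.095 | 0.062 | 0.036 | 0.011 | 1.000 | 0.075 | 0.697 | 0.044
Gender | 0.066 | 0.054 | 0.083 | 0.027 | 0.017 | 0.010 | 0.072 | 0.035 | 0.065 | 0.049 | 0.025 | 0.011 | 0.049 | 0.000 | 0.075 | 1.000 | 0.020 | 0.006
IsHispanic | 0.042 | 0.026 | 0.085 | 0.000 | 0.032 | 0.006 | 0.040 | 0.029 | 0.045 | 0.022 | 0.019 | 0.020 | 0.051 | 0.004 | 0.697 | 0.020 | 1.000 | 0.000
Target | 0.017 | 0.161 | 0.016 | 0.155 | 0.156 | 0.122 | 0.046 | 0.033 | 0.163 | 0.272 | 0.134 | 0.132 | 0.085 | 0.041 | 0.044 | 0.006 | 0.000 | 1.000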
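
One way to read a matrix of this size is to list only the pairs whose coefficient exceeds a cut-off. The sketch below does that for a subset of the coefficients above. The 0.5 cut-off and the names `CORR`, `THRESHOLD` and `high_pairs` are assumptions made for illustration, not settings read from the report's configuration.

```python
# Coefficients copied from the correlation matrix above (upper triangle only,
# restricted to seven of the eighteen variables to keep the example short).
CORR = {
    ("Attempted", "Full"): 0.901,
    ("Attempted", "NumberOfMajors"): 0.180,
    ("Attempted", "NumberOfUniqueMajors"): 0.221,
    ("Attempted", "Ethnicity"): 0.049,
    ("Attempted", "IsHispanic"): 0.026,
    ("Full", "NumberOfMajors"): 0.104,
    ("Full", "NumberOfUniqueMajors"): 0.140,
    ("Full", "Ethnicity"): 0.095,
    ("Full", "IsHispanic"): 0.019,
    ("NumberOfMajors", "NumberOfUniqueMajors"): 0.515,
    ("NumberOfMajors", "Ethnicity"): 0.023,
    ("NumberOfMajors", "IsHispanic"): 0.040,
    ("NumberOfUniqueMajors", "MajorChangedFromLast"): 0.493,
    ("NumberOfUniqueMajors", "Ethnicity"): 0.030,
    ("NumberOfUniqueMajors", "IsHispanic"): 0.029,
    ("Ethnicity", "IsHispanic"): 0.697,
}

THRESHOLD = 0.5  # assumed cut-off for calling a pair "highly correlated"

def high_pairs(corr, threshold):
    """Return (a, b, r) triples with |r| >= threshold, strongest first."""
    flagged = [(a, b, r) for (a, b), r in corr.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: -abs(item[2]))

flagged = high_pairs(CORR, THRESHOLD)
for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this cut-off the sketch returns the same three pairs that the Alerts section of the report lists as highly correlated, while NumberOfUniqueMajors and MajorChangedFromLast (0.493) fall just below it.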

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
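
The per-column count behind a nullity plot can be sketched in a few lines. The rows below are copied from the sample rows of this report and cut down to four columns; the helper `missing_per_column` is hypothetical, and treating only `None` and NaN as missing is an assumption about how missingness is defined.

```python
import math

# Four rows, four columns, copied from the report's sample rows.
ROWS = [
    {"StudentID": 101001184, "Age": 27.0, "CumGPALast": 2.6000, "IsHispanic": "No"},
    {"StudentID": 101001184, "Age": 28.0, "CumGPALast": 2.4286, "IsHispanic": "No"},
    {"StudentID": 102001082, "Age": 20.0, "CumGPALast": 3.8478, "IsHispanic": "No"},
    {"StudentID": 10112752012, "Age": 19.0, "CumGPALast": 2.75, "IsHispanic": "unknown"},
]

def is_missing(value):
    """Assumed definition: None or a float NaN counts as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def missing_per_column(rows):
    """Count missing cells for each column named in the first row."""
    return {
        column: sum(1 for row in rows if is_missing(row.get(column)))
        for column in rows[0]
    }

missing = missing_per_column(ROWS)
print(missing)
```

Under this definition the string "unknown" in IsHispanic is an ordinary category value, not a missing cell, which is compatible with the report showing 0 missing cells even though some rows carry that value.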

Sample

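First rows

Row | StudentID | Attempted | Full | Age | Dev | Internet | PercentageOfRepeats | PercentageOfHistDrop | TermCode | MajorChangedFromLast | NumberOfMajors | NumberOfUniqueMajors | CumGPALast | PercentageOfAbsence | Ethnicity | Gender | IsHispanic | Target
0 | 101001184 | 16.0 | 1 | 27.0 | 0 | 0 | 0.250000 | 0.000000 | B17C | 0 | 3 | 1 | 2.6000 | 0.020000 | White, Non-Hispanic | Female | No | 0
1 | 101001184 | 10.0 | 0 | 28.0 | 1 | 0 | 0.000000 | 0.000000 | B18Q | 1 | 5 | 2 | 2.4286 | 0.230769 | White, Non-Hispanic | Female | No | 0
2 | 102001082 | 17.0 | 1 | 20.0 | 0 | 2 | 0.285714 | 0.047619 | B17C | 0 | 4 | 1 | 3.8478 | 0.090909 | White, Non-Hispanic | Female | No | 1
3 | 102002426 | 14.0 | 1 | 23.0 | 0 | 2 | 0.000000 | 0.111111 | B17C | 1 | 9 | 2 | 2.9057 | 0.236842 | Hispanic | Female | Yes | 0
4 | 102002426 | 15.0 | 1 | 25.0 | 0 | 5 | 0.000000 | 0.095238 | B19Q | 0 | 13 | 3 | 2.8833 | 0.000000 | Hispanic | Female | Yes | 0
5 | 102002757 | 9.0 | 0 | 30.0 | 0 | 0 | 1.000000 | 0.083333 | B17C | 0 | 11 | 1 | 3.3676 | 0.044444 | American Indian or Alaskan Native | Female | No | 0
6 | 102002757 | 8.0 | 0 | 30.0 | 0 | 0 | 0.000000 | 0.076923 | B17Q | 0 | 12 | 1 | 3.2917 | 0.088235 | American Indian or Alaskan Native | Female | No | 0
7 | 102002757 | 19.0 | 1 | 31.0 | 0 | 0 | 0.000000 | 0.073171 | B18C | 1 | 13 | 2 | 3.2917 | 0.078431 | American Indian or Alaskan Native | Female | No | 0
8 | 102002757 | 6.0 | 0 | 31.0 | 0 | 0 | 0.000000 | 0.062500 | B18Q | 1 | 15 | 2 | 3.2447 | 0.111111 | American Indian or Alaskan Native | Female | No | 0
9 | 102007458 | 13.0 | 0 | 22.0 | 0 | 0 | 0.000000 | 0.000000 | B17C | 0 | 4 | 1 | 0.9474 | 0.281250 | Hispanic | Male | Yes | 1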
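
Last rows

Row | StudentID | Attempted | Full | Age | Dev | Internet | PercentageOfRepeats | PercentageOfHistDrop | TermCode | MajorChangedFromLast | NumberOfMajors | NumberOfUniqueMajors | CumGPALast | PercentageOfAbsence | Ethnicity | Gender | IsHispanic | Target
24068 | 10112752012 | 12.0 | 1 | 19.0 | 0 | 0 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 2.75 | 0.056604 | White, Non-Hispanic | Male | unknown | 0
24069 | 10112752013 | 9.0 | 0 | 36.0 | 0 | 1 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 4.00 | 0.038462 | White, Non-Hispanic | Male | unknown | 0
24070 | 10112752014 | 12.0 | 1 | 42.0 | 0 | 1 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 4.00 | 0.052632 | White, Non-Hispanic | Male | No | 0
24071 | 10112752031 | 9.0 | 0 | 24.0 | 2 | 0 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 2.80 | 0.239130 | Hispanic | Male | Yes | 0
24072 | 10112752070 | 20.0 | 1 | 20.0 | 0 | 0 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 3.80 | 0.000000 | White, Non-Hispanic | Male | No | 0
24073 | 10112752072 | 20.0 | 1 | 33.0 | 0 | 0 | 0.0 | 0.0 | B23C | 0 | 3 | 1 | 4.00 | 0.000000 | White, Non-Hispanic | Male | No | 0
24074 | 10112752077 | 20.0 | 1 | 25.0 | 0 | 0 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 4.00 | 0.000000 | American Indian or Alaskan Native | Male | No | 0
24075 | 10112752099 | 20.0 | 1 | 29.0 | 0 | 0 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 3.80 | 0.000000 | White, Non-Hispanic | Male | No | 0
24076 | 10112752138 | 20.0 | 1 | 20.0 | 0 | 0 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 4.00 | 0.000000 | White, Non-Hispanic | Male | unknown | 0
24077 | 10112752140 | 9.0 | 0 | 24.0 | 1 | 0 | 0.0 | 0.0 | B23C | 0 | 2 | 1 | 4.00 | 0.294118 | White, Non-Hispanic | Male | unknown | 1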